Google's DataGemma models address hallucination in LLMs by grounding their output in real-world statistics from the Data Commons knowledge graph. They use two approaches: Retrieval Interleaved Generation (RIG) and Retrieval Augmented Generation (RAG). In RIG, the model is fine-tuned to identify statistics in its own response and check them against Data Commons. In RAG, relevant information is retrieved from Data Commons before the LLM generates its text.
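The difference between the two approaches is mostly a matter of when retrieval happens, which a toy sketch can make concrete. The code below is illustrative only and is not DataGemma's implementation: `TOY_STORE` is a hypothetical in-memory stand-in for Data Commons holding made-up placeholder values, the `[DC("query") -> "value"]` marker syntax is an assumption chosen for this example, and no real model or Data Commons API is called.

```python
import re

# Hypothetical stand-in for Data Commons: a tiny in-memory lookup table.
# The entries are made-up placeholders, not real statistics.
TOY_STORE = {
    "population of exampleland": "1000",
    "median age of exampleland": "30",
}


def lookup(query):
    """Return the stored value for a query, or None if there is no match."""
    return TOY_STORE.get(query.strip().lower())


# Assumed marker syntax for this sketch: [DC("query") -> "model's draft value"]
ANNOTATION = re.compile(r'\[DC\("([^"]+)"\) -> "([^"]+)"\]')


def rig_postprocess(draft):
    """RIG-style: retrieval is interleaved with generation.

    The model has already drafted text and marked each statistic with a
    query. Each marker is resolved against the store after the fact.
    """
    def replace(match):
        query, draft_value = match.group(1), match.group(2)
        retrieved = lookup(query)
        if retrieved is None:
            # Nothing to check against: keep the model's own number.
            return draft_value
        return f"{retrieved} (per store; model draft: {draft_value})"

    return ANNOTATION.sub(replace, draft)


def rag_build_prompt(question, queries):
    """RAG-style: retrieval happens before generation.

    Values are fetched first and placed in the prompt, so the model
    writes its answer with the retrieved statistics in view.
    """
    facts = []
    for query in queries:
        value = lookup(query)
        if value is not None:
            facts.append(f"- {query}: {value}")
    context = "\n".join(facts)
    return f"Use only these statistics:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    draft = 'Exampleland has [DC("population of Exampleland") -> "1200"] residents.'
    print(rig_postprocess(draft))
    print(rag_build_prompt("How many people live in Exampleland?",
                           ["population of Exampleland"]))
```

In this sketch the RIG path corrects a number the model has already written, while the RAG path shapes what the model sees before it writes anything. A real system would replace `lookup` with a query to Data Commons and the hard-coded draft with actual model output.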
Monday, September 16, 2024